    • A comparison of performance of K-complex classification methods using feature selection 

      Hernández-Pereira, Elena; Bolón-Canedo, Verónica; Sánchez-Maroño, Noelia; Álvarez-Estévez, Diego; Moret-Bonillo, Vicente; Alonso-Betanzos, Amparo (2016-01-20)
      [Abstract]: The main objective of this work is to obtain a method that achieves the best accuracy results with a low false positive rate in the classification of K-complexes, a kind of transient waveform found in the ...
    • A scalable decision-tree-based method to explain interactions in dyadic data 

      Eiras-Franco, Carlos; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (Elsevier, 2019-12)
      [Abstract]: Gaining relevant insight from a dyadic dataset, which describes interactions between two entities, is an open problem that has sparked the interest of researchers and industry data scientists alike. However, ...
    • A scalable saliency-based feature selection method with instance-level information 

      Cancela, Brais; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo; Gama, João (Elsevier, 2019-11)
      [Abstract]: Classic feature selection techniques remove irrelevant or redundant features to achieve a subset of relevant features in compact models that are easier to interpret and so improve knowledge extraction. Most ...
    • An Agent-Based Model to Simulate the Spread of a Virus Based on Social Behavior and Containment Measures 

      Seijas Carpente, Manuel; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Rodríguez-Arias, Alejandro; Dumitru, Adina (MDPI AG, 2020-08-20)
      [Abstract]: COVID-19 has brought a new normality in society. However, to avoid the situation, the virus must be stopped. There are several ways in which the governments of the world have taken action, from small measures ...
    • Anomaly Detection on Natural Language Processing to Improve Predictions on Tourist Preferences 

      Meira, Jorge; Carneiro, João; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo; Novais, Paulo; Marreiros, Goreti (MDPI, 2022)
      [Abstract]: Argumentation-based dialogue models have shown to be appropriate for decision contexts in which it is intended to overcome the lack of interaction between decision-makers, either because they are dispersed, they ...
    • Community detection and social network analysis based on the Italian wars of the 15th century 

      Fumanal-Idocin, Javier; Alonso-Betanzos, Amparo; Cordón, Oscar; Bustince, Humberto; Minárová, Mária (Elsevier, 2020)
      [Abstract]: In this contribution we study social network modelling by using human interaction as a basis. To do so, we propose a new set of functions, affinities, designed to capture the nature of the local interactions ...
    • Data-driven predictive maintenance framework for railway systems 

      Meira, Jorge; Veloso, Bruno; Bolón-Canedo, Verónica; Marreiros, Goreti; Alonso-Betanzos, Amparo; Gama, João (IOS Press, 2023)
      [Abstract]: The emergence of the Industry 4.0 trend brings automation and data exchange to industrial manufacturing. Using computational systems and IoT devices allows businesses to collect and deal with vast volumes of ...
    • Dealing with heterogeneity in the context of distributed feature selection for classification 

      Morillo-Salas, José Luis; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo (Springer, 2021)
      [Abstract]: Advances in the information technologies have greatly contributed to the advent of larger datasets. These datasets often come from distributed sites, but even so, their large size usually means they cannot be ...
    • Distributed classification based on distances between probability distributions in feature space 

      Montero Manso, Pablo; Morán-Fernández, Laura; Bolón-Canedo, Verónica; Vilar, José; Alonso-Betanzos, Amparo (Elsevier, 2019-09)
      [Abstract]: We consider a distributed framework where training and test samples drawn from the same distribution are available, with the training instances spread across disjoint nodes. In this setting, a novel learning ...
    • Distributed correlation-based feature selection in spark 

      Palma Mendoza, Raúl José; Marcos, Luis de; Rodríguez, Daniel; Alonso-Betanzos, Amparo (Elsevier, 2019-09)
      [Abstract]: Feature selection (FS) is a key preprocessing step in data mining. CFS (Correlation-Based Feature Selection) is an FS algorithm that has been successfully applied to classification problems in many domains. We ...
    • E2E-FS: An End-to-End Feature Selection Method for Neural Networks 

      Cancela, Brais; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo (IEEE, 2023-07)
      [Abstract]: Classic embedded feature selection algorithms are often divided in two large groups: tree-based algorithms and LASSO variants. Both approaches are focused in different aspects: while the tree-based algorithms ...
    • Ensembles for feature selection: A review and future trends 

      Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo (Elsevier, 2019)
      [Abstract]: Ensemble learning is a prolific field in Machine Learning since it is based on the assumption that combining the output of multiple models is better than using a single model, and it usually provides good ...
    • Fast anomaly detection with locality-sensitive hashing and hyperparameter autotuning 

      Meira, Jorge; Eiras-Franco, Carlos; Bolón-Canedo, Verónica; Marreiros, Goreti; Alonso-Betanzos, Amparo (Elsevier, 2022-08)
      [Abstract]: This paper presents LSHAD, an anomaly detection (AD) method based on Locality Sensitive Hashing (LSH), capable of dealing with large-scale datasets. The resulting algorithm is highly parallelizable and its ...
    • Fast Distributed kNN Graph Construction Using Auto-tuned Locality-sensitive Hashing 

      Eiras-Franco, Carlos; Martínez Rego, David; Kanthan, Leslie; Piñeiro, César; Bahamonde, Antonio; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo (Association for Computing Machinery, 2020)
      [Abstract]: The k-nearest-neighbors (kNN) graph is a popular and powerful data structure that is used in various areas of Data Science, but the high computational cost of obtaining it hinders its use on large datasets. ...
    • Feature Selection With Limited Bit Depth Mutual Information for Embedded Systems 

      Morán-Fernández, Laura; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo (MDPI AG, 2018-09-17)
      [Abstract]: Data is growing at an unprecedented pace. With the variety, speed and volume of data flowing through networks and databases, newer approaches based on machine learning are required. But what is really big in Big ...
    • How Agent-based modeling can help to foster sustainability projects 

      Sánchez-Maroño, Noelia; Rodríguez Arias, Alejandro; Dumitru, Adina; Lema-Blanco, Isabel; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo (Elsevier, 2022)
      [Abstract]: The Sustainable Development Goals (SDGs) adopted by the United Nations require relevant social changes that sometimes involve the development of innovative projects that cause rejection and confrontation. ...
    • How Important Is Data Quality? Best Classifiers vs Best Features 

      Morán-Fernández, Laura; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo (Elsevier, 2021)
      [Abstract]: The task of choosing the appropriate classifier for a given scenario is not an easy-to-solve question. First, there is an increasingly high number of algorithms available belonging to different families. And ...
    • Insights into distributed feature ranking 

      Bolón-Canedo, Verónica; Sechidis, Konstantinos; Sánchez-Maroño, Noelia; Alonso-Betanzos, Amparo; Brown, Gavin (Elsevier, 2019)
      [Abstract]: In an era in which the volume and complexity of datasets is continuously growing, feature selection techniques have become indispensable to extract useful information from huge amounts of data. However, existing ...
    • Interpretable market segmentation on high dimension data 

      Eiras-Franco, Carlos; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (MDPI AG, 2018-09-17)
      [Abstract]: Obtaining relevant information from the vast amount of data generated by interactions in a market or, in general, from a dyadic dataset, is a broad problem of great interest both for industry and academia. Also, ...
    • Large scale anomaly detection in mixed numerical and categorical input spaces 

      Eiras-Franco, Carlos; Martínez Rego, David; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (Elsevier, 2019)
      [Abstract]: This work presents the ADMNC method, designed to tackle anomaly detection for large-scale problems with a mixture of categorical and numerical input variables. A flexible parametric probability measure is ...